Namespaces, for a better understanding of containers
For this topic, we assume that you know your way around a Linux system and that you have heard of shared memory segments, sockets and so on. It will also be easier to follow if you have already played with cgroups on your computer.
Another prerequisite, and an important one: namespaces have nothing to do with virtualization (and that applies to containers too!).
Namespaces are a useful kernel feature that first appeared in 2002, with mount namespaces in Linux 2.4.19; the other types were added over the following years. They let you create a kind of parallel worlds on top of the same kernel, each isolated from the others.
Namespaces have a negligible overhead, which makes them a powerful way to run a task in a dedicated domain (the concept of isolation) with the same performance as on the native OS.
Creating namespaces generally requires privileges (user namespaces are the exception on kernels that allow unprivileged creation), and keep in mind that the kernel still sees everything!
List of namespaces
Linux provides the following namespaces:
- NS (mount):
  - Create mount points on different file systems
  - Ephemeral file systems
  - Mount read-only partitions
- PID:
  - Can have its own PID 1
  - Limits the visibility of other processes on the system
  - Prevents signal interactions between processes of different namespaces
  - Prevents processes from inspecting information about processes outside their namespace
- NET:
  - Gives processes their own network stack
  - Gives access to network interfaces (virtual or physical)
  - Own routing tables, firewall, IP rules
- CGROUP:
  - Better isolation of the cgroup information exposed to processes
  - Abstracts the system hosting the container
  - This is the cgroup namespace, not cgroups themselves, which are a separate Linux feature
- IPC:
  - Isolates the inter-process communication mechanisms provided by the kernel
  - Semaphores / shared memory / message queues
- USER:
  - The most controversial
  - Can significantly increase security
  - Can also enlarge the kernel attack surface
  - Runs a process with root privileges inside a container while it stays unprivileged on the host
- UTS:
  - Presents a different hostname to processes
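Every process exposes the namespaces it belongs to under /proc/<pid>/ns, one symbolic link per namespace type. As a minimal sketch (the helper name list_ns is ours, not a standard command), the following prints them for a given PID, defaulting to the current shell:

```shell
# Print "<type><TAB><identifier>" for each namespace of a process.
# $1: PID to inspect (optional, defaults to the current shell).
list_ns() {
    pid="${1:-$$}"
    for f in /proc/"$pid"/ns/*; do
        printf '%s\t%s\n' "$(basename "$f")" "$(readlink "$f")"
    done
}

list_ns
```

The exact set of entries depends on the kernel version. Two processes that show the same identifier for a given type live in the same namespace of that type.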
Example on network (NET) namespaces
In this short example, we are going to play with network namespaces. The main objective is to create a virtual link between the host and the namespace so that they can ping each other.
Create a network namespace:
$ cd ~
$ mkdir example-netns && cd example-netns
$ sudo ip netns add lapin
List all existing network namespaces:
$ sudo ip netns list
Execute a command in a network namespace:
First, you can pass your command directly to ip netns exec:
# sudo ip netns exec <namespace> <your command>
$ sudo ip netns exec lapin ip -c a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
Or you can open a bash shell inside the namespace:
$ sudo ip netns exec lapin bash
Bring the lo link up in the namespace:
$ sudo ip netns exec lapin ip link set dev lo up
$ sudo ip netns exec lapin ip -c a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
Create the link, a pair of connected virtual Ethernet (veth) interfaces:
$ sudo ip link add veth-0 type veth peer name veth-1
$ ip -c a
...
7: veth-1@veth-0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether e6:bc:f0:b2:28:4a brd ff:ff:ff:ff:ff:ff
8: veth-0@veth-1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 06:58:9c:fc:f3:3d brd ff:ff:ff:ff:ff:ff
Move one of the two virtual interfaces into the namespace:
$ sudo ip link set veth-1 netns lapin
$ sudo ip netns exec lapin ip -c a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
7: veth-1@if8: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether e6:bc:f0:b2:28:4a brd ff:ff:ff:ff:ff:ff link-netnsid 0
Give IP addresses to the virtual interfaces:
On the host:
$ sudo ip addr add 22.214.171.124/24 dev veth-0
$ sudo ip link set dev veth-0 up
$ ip -c a
...
8: veth-0@if7: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
    link/ether 06:58:9c:fc:f3:3d brd ff:ff:ff:ff:ff:ff link-netns lapin
    inet 126.96.36.199/24 scope global veth-0
       valid_lft forever preferred_lft forever
In the namespace:
$ sudo ip netns exec lapin ip addr add 188.8.131.52/24 dev veth-1
$ sudo ip netns exec lapin ip link set dev veth-1 up
$ sudo ip netns exec lapin ip -c a
...
7: veth-1@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether e6:bc:f0:b2:28:4a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 184.108.40.206/24 scope global veth-1
       valid_lft forever preferred_lft forever
    inet6 fe80::e4bc:f0ff:feb2:284a/64 scope link
       valid_lft forever preferred_lft forever
Test your network:
Ping the namespace from the host:
$ ping -c 4 220.127.116.11
PING 18.104.22.168 (22.214.171.124) 56(84) bytes of data.
64 bytes from 126.96.36.199: icmp_seq=1 ttl=64 time=0.111 ms
64 bytes from 188.8.131.52: icmp_seq=2 ttl=64 time=0.091 ms
64 bytes from 184.108.40.206: icmp_seq=3 ttl=64 time=0.121 ms
64 bytes from 220.127.116.11: icmp_seq=4 ttl=64 time=0.069 ms
--- 18.104.22.168 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 74ms
Ping the host from the namespace:
$ sudo ip netns exec lapin ping -c 4 22.214.171.124
PING 126.96.36.199 (188.8.131.52) 56(84) bytes of data.
64 bytes from 184.108.40.206: icmp_seq=1 ttl=64 time=0.044 ms
64 bytes from 220.127.116.11: icmp_seq=2 ttl=64 time=0.108 ms
64 bytes from 18.104.22.168: icmp_seq=3 ttl=64 time=0.114 ms
64 bytes from 22.214.171.124: icmp_seq=4 ttl=64 time=0.118 ms
--- 126.96.36.199 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 100ms
To delete the network namespace you created:
$ sudo ip netns del lapin
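The whole walkthrough can be condensed into one function. This is only a sketch under stated assumptions: the addresses 10.0.0.1/24 and 10.0.0.2/24 are ours (chosen so that both ends of the veth pair sit in the same subnet), and so are the run helper and the DRY_RUN switch, which by default prints the commands instead of executing them.

```shell
# Assumption: DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to really execute them through sudo.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"      # show what would be executed
    else
        sudo "$@"      # execute as root
    fi
}

netns_demo() {
    ns="lapin"
    run ip netns add "$ns"
    run ip netns exec "$ns" ip link set dev lo up
    # veth-0 stays on the host, veth-1 moves into the namespace
    run ip link add veth-0 type veth peer name veth-1
    run ip link set veth-1 netns "$ns"
    # Hypothetical addresses, both in 10.0.0.0/24
    run ip addr add 10.0.0.1/24 dev veth-0
    run ip link set dev veth-0 up
    run ip netns exec "$ns" ip addr add 10.0.0.2/24 dev veth-1
    run ip netns exec "$ns" ip link set dev veth-1 up
    # Ping the host end from inside the namespace
    run ip netns exec "$ns" ping -c 4 10.0.0.1
    # Deleting the namespace destroys veth-1, and its peer veth-0 with it
    run ip netns del "$ns"
}

netns_demo
```

Running it once in dry-run mode is a cheap way to review the sequence before letting it touch the host network configuration.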
Example on mount (NS) namespaces
In this short part, we are going to create a directory on the host, then mount a file system on it from inside a namespace and inspect what happens.
So first, let's create a directory under your home directory.
$ mkdir /home/pfontaine/ns_mnt_tuto
Now we are going to create a namespace using the unshare command, which its manual page describes as:
unshare - run program with some namespaces unshared from parent
We are going to use the -m option to create a new mount namespace and run /bin/bash inside it:
$ sudo unshare -m /bin/bash
Now let's check whether the namespace inodes differ between the host and the namespace. For that we use readlink on /proc/$$/ns/mnt:
# On Namespace
$ readlink /proc/$$/ns/mnt
mnt:
# On Host
$ readlink /proc/$$/ns/mnt
mnt:
If the two identifiers differ, the shell really runs in its own mount namespace. This is a handy way to troubleshoot and check that a namespace works as it should.
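The same comparison can be scripted. The sketch below assumes a hypothetical helper named same_ns that compares the /proc/<pid>/ns links of two processes; it returns 2 when one of the links cannot be read, which typically happens for processes owned by another user.

```shell
# same_ns PID_A PID_B TYPE
# Returns 0 if both processes are in the same namespace of the given
# type (mnt, net, pid, uts, ...), 1 if not, 2 if a link is unreadable.
same_ns() {
    a=$(readlink "/proc/$1/ns/$3") || return 2
    b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$a" = "$b" ]
}

# Compare the current shell with its parent process.
if same_ns "$$" "$PPID" mnt; then
    echo "shell and parent share the same mount namespace"
else
    echo "different mount namespace (or /proc entry not readable)"
fi
```

Run from the bash started by unshare -m, the comparison with the parent is expected to report a different mount namespace, since the parent stayed on the host side.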
Now we're going to use the df utility, a good tool to inspect what is mounted and to get usage statistics for each file system.
We can check what is mounted in our namespace.
# On Namespace
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
...
tmpfs           7,7G  2,8M  7,7G   1% /run
tmpfs           1,6G   32K  1,6G   1% /run/user/42
tmpfs           1,6G  112K  1,6G   1% /run/user/1000
tmpfs           7,7G  170M  7,6G   3% /tmp
...
At this point, everything mounted on the host appears to be mounted in this namespace as well: a new mount namespace starts with a copy of its parent's mount points.
What happens if we unmount the /tmp file system from our namespace?
# On Namespace
$ umount -t tmpfs /tmp
# On Namespace
$ df -h | grep /tmp
# No results ...
So it looks correct from the namespace's point of view. What about the host's point of view?
# On Host
$ df -h | grep /tmp
tmpfs           7,7G  191M  7,5G   3% /tmp
Great, it is still mounted! We clearly see that what happens in the namespace doesn't affect the host.
Now we're able to mount a new /tmp that will not affect our host. You can do the same with other directories, and even recreate a complete OS-like tree.
# On Namespace
$ mount -n -t tmpfs tmpfs /tmp
Do you remember the directory we created at the beginning? We're going to mount a file system on it and see what happens in both worlds.
# On Namespace
$ mount -n -t tmpfs tmpfs /home/pfontaine/ns_mnt_tuto/
# On Namespace
$ df -h | grep ns_mnt_tuto
tmpfs           7,7G     0  7,7G   0% /home/pfontaine/ns_mnt_tuto
# On Host
$ df -h | grep ns_mnt_tuto
# No results ...
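The same check can be done in one shot from a script. The sketch below departs from the steps above in one respect, and that is an assumption: instead of sudo it uses unshare -r (map the current user to root in a new user namespace) together with -m, which only works on kernels that allow unprivileged user namespaces. The function name mount_demo and the temporary directory are ours; when the namespace cannot be created, it prints "skipped".

```shell
mount_demo() {
    dir=$(mktemp -d) || return 1
    # Inside new user + mount namespaces: mount a tmpfs on $dir and
    # count the entries for it in this namespace's mount table.
    inside=$(unshare -r -m sh -c \
        'mount -t tmpfs tmpfs "$1" && grep -c " $1 " /proc/self/mounts' \
        sh "$dir" 2>/dev/null)
    if [ -z "$inside" ]; then
        echo "skipped"
    else
        # Outside: the same search in the caller's mount table.
        outside=$(grep -c " $dir " /proc/self/mounts || true)
        echo "inside=$inside outside=$outside"
    fi
    rmdir "$dir"
}

mount_demo
```

When the namespaces can be created, the mount should be counted once inside and not at all outside, and it disappears together with the namespace as soon as the inner shell exits.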
On the way to containerization
With these two examples, we hope that you have a better understanding of what happens under the hood with containers! You can go further by testing the other namespaces: the unshare command has many other options for UTS, IPC, PID ... Here the Unix manual is your friend!
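As a starting point for those experiments, here is a minimal sketch for the UTS namespace. It makes the same assumption as any unprivileged use of unshare: the kernel must allow unprivileged user namespaces for -r to work, otherwise run the inner command with sudo unshare -u instead. The function name uts_demo is hypothetical, and the hostname is reused from the network example.

```shell
uts_demo() {
    # Change the hostname inside new user + UTS namespaces only.
    inside=$(unshare -r -u sh -c 'hostname lapin && uname -n' 2>/dev/null)
    if [ -z "$inside" ]; then
        echo "skipped"
    else
        # The calling shell still reports the original hostname.
        echo "inside=$inside outside=$(uname -n)"
    fi
}

uts_demo
```

The hostname set inside the namespace is never visible from the calling shell, which is the same isolation pattern seen with mounts and network interfaces.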
Isolation is a critical part of security, and namespaces are a very powerful and useful tool for it. But don't forget that namespaces are less secure than they look: you have to test your configuration and check its side effects. Pay attention to details and never forget that the feature is still evolving! But if you are smart (sure you are!) and well documented, namespaces are one of the most sophisticated protection facilities in the Linux kernel.